Lagrangian relaxation for optimal corpus design
Abstract
This article addresses the problem of designing the linguistic content of a speech corpus. Depending on the target task (speech recognition, speech synthesis, etc.), we try to control the phonological and linguistic content of the corpus by collecting an optimal set of sentences that covers a preset description of phonological attributes (prosodic tags, allophones, syllables, etc.) under the constraint of a minimal overall duration. This goal is classically pursued with greedy algorithms, which, however, do not guarantee the optimality of the resulting cover. We propose to use the principle of Lagrangian relaxation, in which a set covering problem is solved by iterating between a primal space and a dual space. We evaluate the proposed methodology against a standard greedy algorithm on the task of estimating an optimal phone and diphone covering in French. Our results show that our algorithm based on Lagrangian relaxation gives a solution 10% better than a standard greedy algorithm and, above all, makes it possible to assess the absolute quality of the proposed solution by providing a lower bound on the set covering problem. According to our experiments, our best solution is only 0.8% above the lower bound of the phone and diphone covering problem.
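The contrast between a greedy cover and a Lagrangian relaxation of the same set covering problem can be illustrated with a minimal sketch. This is not the authors' implementation: it is a generic textbook subgradient scheme, and the data layout, the toy corpus, and every name in it (`Corpus`, `TOY_CORPUS`, `greedy_cover`, `lagrangian_cover`, `prune`, `total_duration`) are assumptions made for illustration. Each sentence has a duration and a set of units (phones, diphones); the relaxation prices every unit with a multiplier, which yields a lower bound on the minimal total duration.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical layout: sentence id -> (duration in seconds, units it contains).
Corpus = Dict[str, Tuple[float, Set[str]]]

# Invented toy corpus; the unit names stand in for phones and diphones.
TOY_CORPUS: Corpus = {
    "s1": (2.0, {"a", "b", "ab"}),
    "s2": (1.5, {"b", "c", "bc"}),
    "s3": (3.0, {"a", "b", "c", "ab", "bc"}),
    "s4": (1.0, {"c", "d", "cd"}),
    "s5": (2.5, {"a", "d", "ad"}),
}


def total_duration(corpus: Corpus, chosen: Iterable[str]) -> float:
    return sum(corpus[s][0] for s in chosen)


def prune(corpus: Corpus, chosen: List[str]) -> List[str]:
    """Drop sentences whose units are all covered by the others, longest first."""
    kept = list(chosen)
    for s in sorted(chosen, key=lambda s: -corpus[s][0]):
        others = set().union(*(corpus[t][1] for t in kept if t != s))
        if corpus[s][1] <= others:
            kept.remove(s)
    return kept


def greedy_cover(corpus: Corpus, start: Iterable[str] = ()) -> List[str]:
    """Repeatedly add the sentence with the lowest duration per newly covered unit."""
    universe = set().union(*(units for _, units in corpus.values()))
    chosen = list(start)
    covered = set().union(*(corpus[s][1] for s in chosen))
    while covered != universe:
        best = min(
            (s for s in corpus if corpus[s][1] - covered),
            key=lambda s: corpus[s][0] / len(corpus[s][1] - covered),
        )
        chosen.append(best)
        covered |= corpus[best][1]
    return prune(corpus, chosen)


def lagrangian_cover(corpus: Corpus, iterations: int = 200, step: float = 1.0):
    """Subgradient optimisation of the Lagrangian dual; assumes positive durations.

    Returns (cover, its duration, lower bound on the optimal duration).
    """
    universe = sorted(set().union(*(units for _, units in corpus.values())))
    best = greedy_cover(corpus)            # primal starting point
    best_cost = total_duration(corpus, best)
    lower = 0.0                            # valid because durations are positive
    # One multiplier per unit, initialised to its cheapest per-unit duration.
    u = {i: min(d / len(units) for d, units in corpus.values() if i in units)
         for i in universe}
    for _ in range(iterations):
        # Dual step: reduced cost of each sentence under the current prices.
        reduced = {s: d - sum(u[i] for i in units)
                   for s, (d, units) in corpus.items()}
        picked = [s for s, r in reduced.items() if r < 0]
        bound = sum(u.values()) + sum(reduced[s] for s in picked)
        lower = max(lower, bound)
        # Primal step: repair the relaxed solution into a feasible cover.
        candidate = greedy_cover(corpus, picked)
        cost = total_duration(corpus, candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
        # Subgradient: 1 minus the number of picked sentences covering each unit.
        count = {i: 0 for i in universe}
        for s in picked:
            for i in corpus[s][1]:
                count[i] += 1
        g = {i: 1 - count[i] for i in universe}
        norm = sum(v * v for v in g.values())
        if norm == 0 or best_cost - lower < 1e-9:
            break
        t = step * (best_cost - bound) / norm
        u = {i: max(0.0, u[i] + t * g[i]) for i in universe}
    return best, best_cost, lower


if __name__ == "__main__":
    print("greedy:", total_duration(TOY_CORPUS, greedy_cover(TOY_CORPUS)))
    print("lagrangian (cover, duration, lower bound):", lagrangian_cover(TOY_CORPUS))
```

On this invented corpus the greedy cover lasts 7.0 s, while the optimal cover (`s3`, `s4`, `s5`) lasts 6.5 s. The gap between the returned duration and the returned lower bound plays the role of the quality certificate described in the abstract; the step-size rule and the repair heuristic are common choices, not ones confirmed by the source.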
Related articles
The Lagrangian Relaxation Method for the Shortest Path Problem Considering Transportation Plans and Budgetary Constraint
In this paper, a constrained shortest path problem (CSP) in a network is investigated, in which some special plans for each link, with corresponding pre-determined costs as well as reduction values in the link travel time, are considered. The purpose is to find a path and select the best plans on its links so as to improve the travel time as much as possible, while the costs of conducting plans do ...
Lagrangian Relaxation Method for the Step fixed-charge Transportation Problem
In this paper, a step fixed-charge transportation problem is developed in which products are sent from the sources to the destinations in the presence of both unit and step fixed charges. The proposed model determines the amount of products on the existing routes with the aim of minimizing the total cost (the sum of unit and step fixed charges) while satisfying the demand of each customer. As the problem i...
Model of Optimal Paths Design for GMPLS Network and Evaluation of Solution
We describe an optimal path design for a GMPLS network that employs the Lagrangian relaxation method, which can be used to estimate lower bounds on the solution of a problem. This feature helps the designer judge the accuracy of the computed solution when deciding whether to apply it to a real network in critical situations. A formulation o...
Comparing performance of different set-covering strategies for linguistic content optimization in speech corpora
Set covering algorithms are efficient tools for optimally reducing a linguistic corpus. The optimality of such a process is directly related to the descriptive features of the sentences of a reference corpus. This article sets out to verify experimentally the behaviour of three algorithms, a greedy approach and a Lagrangian relaxation based one giving importance to rare events and a thi...
pth Power Lagrangian Method for Integer Programming
When does there exist an optimal generating Lagrangian multiplier vector (that generates an optimal solution of an integer programming problem in a Lagrangian relaxation formulation), and in cases of nonexistence, can we produce the existence in some other equivalent representation space? Under what conditions does there exist an optimal primal-dual pair in integer programming? This paper cons...
Publication date: 2007